ground-level image
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts (0.04)
- Europe > Sweden (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- (2 more...)
Scaling Image Geo-Localization to Continent Level
Lindenberger, Philipp, Sarlin, Paul-Edouard, Hosang, Jan, Balice, Matteo, Pollefeys, Marc, Lynen, Simon, Trulls, Eduard
Determining the precise geographic location of an image at a global scale remains an unsolved challenge. Standard image retrieval techniques are inefficient due to the sheer volume of images (>100M) and fail when coverage is insufficient. Scalable solutions, however, involve a trade-off: global classification typically yields coarse results (10+ kilometers), while cross-view retrieval between ground and aerial imagery suffers from a domain gap and has been studied primarily on smaller regions. This paper introduces a hybrid approach that achieves fine-grained geo-localization across a geographic expanse the size of a continent. We leverage a proxy classification task during training to learn rich feature representations that implicitly encode precise location information. We combine these learned prototypes with embeddings of aerial imagery to increase robustness to the sparsity of ground-level data. This enables direct, fine-grained retrieval over areas spanning multiple countries. Our extensive evaluation demonstrates that our approach localizes more than 68% of queries within 200 m on a dataset covering a large part of Europe. The code is publicly available at https://scaling-geoloc.github.io.
- Europe > Western Europe (0.04)
- Europe > Belgium (0.04)
- North America > United States > Massachusetts (0.04)
- (12 more...)
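The hybrid idea above (learn per-geocell prototypes via a proxy classification task, fuse them with aerial embeddings, then retrieve directly) can be sketched minimally. This is an illustrative reconstruction, not the paper's code: the prototype and embedding values are random stand-ins for learned models, and all names and shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each geocell (a fine-grained location bin) has a
# prototype vector learned through the proxy classification task; an aerial
# embedding per cell is fused in to compensate for sparse ground-level data.
n_cells, dim = 1000, 64
ground_prototypes = rng.normal(size=(n_cells, dim))
aerial_embeddings = rng.normal(size=(n_cells, dim))

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Fuse both sources per cell, then normalize for cosine retrieval.
prototypes = l2_normalize(ground_prototypes + aerial_embeddings)

def localize(query_embedding):
    """Return the geocell whose fused prototype best matches the query."""
    sims = prototypes @ l2_normalize(query_embedding)
    return int(np.argmax(sims))

# A query embedded near cell 42's prototype retrieves cell 42.
query = prototypes[42] + 0.01 * rng.normal(size=dim)
assert localize(query) == 42
```

Retrieval over prototypes rather than over all (>100M) database images is what makes the fine-grained search tractable at continent scale.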
TaxaBind: A Unified Embedding Space for Ecological Applications
Sastry, Srikumar, Khanal, Subash, Dhakal, Aayush, Ahmad, Adeel, Jacobs, Nathan
We present TaxaBind, a unified embedding space for characterizing any species of interest. TaxaBind is a multimodal embedding space across six modalities: ground-level images of species, geographic location, satellite imagery, text, audio, and environmental features, useful for solving ecological problems. To learn this joint embedding space, we leverage ground-level images of species as a binding modality. We propose multimodal patching, a technique for effectively distilling the knowledge from various modalities into the binding modality. We construct two large datasets for pretraining: iSatNat with species images and satellite images, and iSoundNat with species images and audio. Additionally, we introduce TaxaBench-8k, a diverse multimodal dataset with six paired modalities for evaluating deep learning models on ecological tasks. Experiments with TaxaBind demonstrate its strong zero-shot and emergent capabilities on a range of tasks including species classification, cross-modal retrieval, and audio classification. The datasets and models are made available at https://github.com/mvrl/TaxaBind.
- Africa > Kenya (0.04)
- North America > United States > Illinois (0.04)
- North America > United States > Hawaii (0.04)
- (3 more...)
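The binding-modality training above rests on contrastive alignment: each modality is pulled toward its paired species-image embedding with an InfoNCE-style loss. A minimal numpy sketch, with random vectors standing in for encoder outputs and the multimodal patching details omitted (all names here are illustrative assumptions, not TaxaBind's API):

```python
import numpy as np

rng = np.random.default_rng(1)

def l2n(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def infonce(binding, other, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss aligning one modality's batch to
    the binding modality; matched pairs sit on the diagonal of the logits."""
    logits = l2n(binding) @ l2n(other).T / temperature
    labels = np.arange(len(binding))
    def ce(lg):
        lg = lg - lg.max(axis=1, keepdims=True)          # numerical stability
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()
    return 0.5 * (ce(logits) + ce(logits.T))

# Species-image embeddings act as the binding anchor; audio, satellite, text,
# location, and environmental embeddings are each aligned to them in turn.
species = rng.normal(size=(8, 32))
audio_aligned = species + 0.05 * rng.normal(size=(8, 32))  # nearly matched
audio_random = rng.normal(size=(8, 32))                    # unrelated

assert infonce(species, audio_aligned) < infonce(species, audio_random)
```

Because every modality is aligned to the same anchor, modalities that were never paired directly (e.g., audio and satellite imagery) still become comparable, which is the source of the zero-shot and emergent capabilities reported.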
GOMAA-Geo: GOal Modality Agnostic Active Geo-localization
Sarkar, Anindya, Sastry, Srikumar, Pirinen, Aleksis, Zhang, Chongjie, Jacobs, Nathan, Vorobeychik, Yevgeniy
We consider the task of active geo-localization (AGL), in which an agent uses a sequence of visual cues observed during aerial navigation to find a target specified through multiple possible modalities. This could emulate a UAV involved in a search-and-rescue operation navigating through an area, observing a stream of aerial images as it goes. The AGL task poses two important challenges. First, an agent must deal with a goal specification in one of multiple modalities (e.g., a natural language description) while the search cues are provided in other modalities (aerial imagery). Second, localization time is limited (e.g., by battery life or urgency), so the goal must be localized as efficiently as possible, i.e., the agent must effectively leverage its sequentially observed aerial views when searching for the goal. To address these challenges, we propose GOMAA-Geo, a goal-modality-agnostic active geo-localization agent, for zero-shot generalization between different goal modalities. Our approach combines cross-modality contrastive learning to align representations across modalities with supervised foundation model pretraining and reinforcement learning to obtain highly effective navigation and localization policies. Through extensive evaluations, we show that GOMAA-Geo outperforms alternative learnable approaches and that it generalizes across datasets (e.g., to disaster-hit areas without seeing a single disaster scenario during training) and goal modalities (e.g., to ground-level imagery or textual descriptions, despite only being trained with goals specified as aerial views). Code and models will be made publicly available at this link.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts (0.04)
- Europe > Sweden (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
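The sequential search problem above can be reduced to a toy for intuition: an agent on a grid of aerial views must reach a goal cell in as few steps as possible. In GOMAA-Geo the goal may arrive in any modality, but contrastive pretraining maps all goals into one space, so the policy only ever sees a goal embedding. In this heavily simplified sketch a distance-based score stands in for embedding similarity, and a greedy rule stands in for the learned RL policy; nothing here reflects the paper's actual architecture.

```python
# Toy active geo-localization: greedy ascent on a similarity-to-goal score.
GRID, GOAL = 8, (6, 3)

def score(pos):
    # Stand-in for goal-embedding similarity: higher when closer to the goal.
    return -(abs(pos[0] - GOAL[0]) + abs(pos[1] - GOAL[1]))

def run_episode(start, max_steps=32):
    pos, steps = start, 0
    while pos != GOAL and steps < max_steps:
        neighbors = [
            (pos[0] + dr, pos[1] + dc)
            for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0))
            if 0 <= pos[0] + dr < GRID and 0 <= pos[1] + dc < GRID
        ]
        pos = max(neighbors, key=score)  # greedy move toward the goal
        steps += 1
    return pos, steps

final, steps = run_episode((0, 0))
assert final == GOAL and steps == 9  # Manhattan distance from (0,0) to (6,3)
```

The real agent has no such oracle score; it must estimate goal proximity from its sequence of aerial observations, which is why reinforcement learning over the aligned embedding space is needed.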
Bird's-Eye View to Street-View: A Survey
Bajbaa, Khawlah, Usman, Muhammad, Anwar, Saeed, Radwan, Ibrahim, Bais, Abdul
In recent years, street view imagery has grown to become one of the most important sources of geospatial data collection and urban analytics, facilitating meaningful insights and assisting in decision-making. Synthesizing a street-view image from its corresponding satellite image is a challenging task due to the significant differences in appearance and viewpoint between the two domains. In this study, we screened 20 recent research papers to provide a thorough review of the state of the art in synthesizing street-view images from their corresponding satellite counterparts. The main findings are: (i) novel deep learning techniques are required for synthesizing more realistic and accurate street-view images; (ii) more datasets need to be collected for public usage; and (iii) more specific evaluation metrics need to be investigated for evaluating the generated images appropriately. We conclude that, because it applies outdated deep learning techniques, the recent literature fails to generate detailed and diverse street-view images.
- Asia > Middle East > Saudi Arabia > Eastern Province > Dhahran (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- (5 more...)
GEOBIND: Binding Text, Image, and Audio through Satellite Images
Dhakal, Aayush, Khanal, Subash, Sastry, Srikumar, Ahmad, Adeel, Jacobs, Nathan
In remote sensing, we are interested in modeling various modalities for a given geographic location. Several works have focused on learning the relationship between a location and its type of landscape, habitability, audio, textual descriptions, etc. Recently, a common way to approach these problems is to train a deep-learning model that uses satellite images to infer some unique characteristics of the location. In this work, we present a deep-learning model, GeoBind, that can infer multiple modalities, specifically text, image, and audio, from satellite imagery of a location. To do this, we use satellite images as the binding element and contrastively align all other modalities to the satellite image data. Our training results in a joint embedding space with multiple types of data: satellite image, ground-level image, audio, and text. Furthermore, our approach does not require a single complex dataset that contains all the modalities mentioned above. Rather, it requires only multiple datasets in which each modality is paired with satellite images. While we only align three modalities in this paper, we present a general framework that can be used to create an embedding space with any number of modalities by using satellite images as the binding element. Our results show that, unlike traditional unimodal models, GeoBind is versatile and can reason about multiple modalities for a given satellite image input.
- North America > United States > Missouri > St. Louis County > St. Louis (0.05)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > China (0.04)
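The key property of the binding setup above is emergent cross-modal retrieval: text and audio are never paired directly, yet each is aligned to the satellite embedding of its location, so they land near the same anchor and become mutually searchable. A minimal sketch under strong simplifying assumptions (alignment is idealized as a small perturbation of the anchor rather than actual contrastive training; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

sat = rng.normal(size=(16, 32))                  # satellite anchors, one per location
text = sat + 0.05 * rng.normal(size=(16, 32))    # aligned via (satellite, text) pairs
audio = sat + 0.05 * rng.normal(size=(16, 32))   # aligned via (satellite, audio) pairs

def nn(query, bank):
    """Index of the nearest row of `bank` by cosine similarity."""
    q = query / np.linalg.norm(query)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    return int(np.argmax(b @ q))

# Emergent zero-shot retrieval: a text embedding retrieves the audio recorded
# at the same location, although the two modalities were never trained together.
assert all(nn(text[i], audio) == i for i in range(16))
```

This is why only satellite-paired datasets are needed: the satellite image acts as a hub through which every other modality becomes comparable.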
Cross-Modal Learning of Housing Quality in Amsterdam
Levering, Alex, Marcos, Diego, Tuia, Devis
In our research we test data and models for the recognition of housing quality in the city of Amsterdam from ground-level and aerial imagery. For ground-level images we compare Google StreetView (GSV) to Flickr images. Our results show that GSV predicts the most accurate building quality scores, approximately 30% better than using only aerial images. However, we find that through careful filtering and by using the right pre-trained model, Flickr image features combined with aerial image features are able to halve the performance gap to GSV features from 30% to 15%. Our results indicate that there are viable alternatives to GSV for liveability factor prediction, which is encouraging as GSV images are more difficult to acquire and not always available.
- Europe > Netherlands > North Holland > Amsterdam (0.63)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > New York (0.04)
- (2 more...)
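The fusion experiment above (ground-level features combined with aerial features to predict a building-quality score) amounts to concatenating the two feature vectors and fitting a regressor. A hedged numpy sketch with synthetic features and scores; in the study the features come from pre-trained CNNs on GSV/Flickr and aerial imagery, and the targets from city quality surveys, none of which are reproduced here:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic stand-ins: per-building ground and aerial feature vectors, and a
# quality score that is (by construction) a noisy linear function of both.
n = 200
ground = rng.normal(size=(n, 8))
aerial = rng.normal(size=(n, 8))
w_true = rng.normal(size=16)
X = np.concatenate([ground, aerial], axis=1)     # feature fusion by concatenation
quality = X @ w_true + 0.1 * rng.normal(size=n)

# Ridge regression, closed form: w = (X^T X + lam I)^{-1} X^T y
lam = 1e-2
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(16), X.T @ quality)
pred = X @ w_hat

# Fused features should explain almost all of the score's variance here.
r2 = 1 - (quality - pred).var() / quality.var()
assert r2 > 0.95
```

In the real setting the interesting question is exactly the one the paper measures: how much of the gap to GSV-based features remains once filtered Flickr features are fused with aerial ones.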
Creating Ground-level Views from Satellite Imagery
Many techniques based on statistics or artificial intelligence exist to help classify and identify areas on satellite imagery. This includes land use characteristics such as urban spaces, agricultural lands, forests, etc. However, recreating a ground-level image and perspective from satellite imagery has only recently been developed and is now an active area of research. Such work has the potential not only to classify land more accurately but also to provide a ground-level perspective that indicates how an area differs from, or resembles, other similar classes. One pioneering technique for generating ground-level views from satellite images was developed at the University of California, Merced.
Given a satellite image, machine learning creates the view on the ground
Leonardo da Vinci famously created drawings and paintings that showed a bird's eye view of certain areas of Italy with a level of detail that was not otherwise possible until the invention of photography and flying machines. Indeed, many critics have wondered how he could have imagined these details. But now researchers are working on the inverse problem: given a satellite image of Earth's surface, what does that area look like from the ground? How clear can such an artificial image be? Today we get an answer thanks to the work of Xueqing Deng and colleagues at the University of California, Merced.
- North America > United States > California > Merced County > Merced (0.25)
- Europe > Italy (0.25)